The CENTER FOR THE STUDY OF EVALUATION OF INSTRUCTIONAL
PROGRAMS is engaged in research that will yield new ideas 
and new tools capable of analyzing and evaluating instruc- 
tion. Staff members are creating new ways to evaluate con- 
tent of curricula, methods of teaching and the multiple 
effects of both on students. The CENTER is unique because 
of its access to Southern California's elementary, secondary
and higher schools of diverse socio-economic levels
and cultural backgrounds. 

COMMENTS ON PROFESSOR WILEY'S PAPER ENTITLED
"DESIGN AND ANALYSIS OF EVALUATION STUDIES" 

Theodore Husek 

I find myself much more interested and stimulated by the latter 
sections of David Wiley’s paper than by the introduction, the defini- 
tions, and the refinements of terminology. I know the language 
framework is necessary, but right now I find, so far as my own work 
is concerned, I am not as interested as perhaps I should be in the 
definitional problem. 

I see the major task of the methodology in evaluation as being 
the development of new ways of helping the content specialist con- 
struct and evaluate educational products. As part of this task we 
need to do a better job of data collection and data analysis. 

In this context I think the paper brought out some extremely
important issues. We should be interested in the distributions of 
scores on tests as well as the mean. At the same time I think we 
need to use the traditional item more, and also reexamine the nature 
of the items which we use in evaluation studies. We have to examine
new indices, whether or not they are obtrusive or unobtrusive. 

Wiley’s point about paying attention to the unit of study is 
also important. We seldom pay as much attention as we should to 
whether we are studying students, classes, teachers, or school
systems. Many times we really are not interested in the individual
student, and in these cases I feel that item sampling may provide
an immense break-through in data collection procedures. As a footnote
I would like to say that I think it unfortunate that this
particular term "item sampling" got started--I do not think it really
represents what is happening; there is more involved than sampling 
just items. 

If we do not have to ask every student every question in our
study, then it may be possible to begin to obtain data on the multi- 
tude of measures that we all seem to think are important. As a 
simple example, in the classroom situation we can use tests which in 
part serve to help us grade the students, in part help us to judge 
the course, and also give us a little data about anything we might 
be interested in. 
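
As a rough illustration of the idea, the sketch below estimates a
group's mean proportion-correct when each student answers only a few
randomly chosen items. The population, the item difficulties, and the
function name `matrix_sample_mean` are all hypothetical; this is a
minimal sketch under those assumptions, not the procedure used in any
study mentioned here.

```python
import random

def matrix_sample_mean(scores, items_per_student, rng):
    """Estimate the mean proportion-correct from an item sample.

    scores[s][i] is 1 if student s would answer item i correctly, else 0.
    Each student is given only `items_per_student` randomly chosen items,
    and only those responses are used in the estimate.
    """
    n_items = len(scores[0])
    observed = []
    for row in scores:
        chosen = rng.sample(range(n_items), items_per_student)
        observed.extend(row[i] for i in chosen)
    return sum(observed) / len(observed)

rng = random.Random(0)
# Hypothetical population: 200 students, 40 items of varying difficulty.
difficulty = [rng.uniform(0.3, 0.9) for _ in range(40)]
scores = [[1 if rng.random() < p else 0 for p in difficulty]
          for _ in range(200)]

# Mean if every student answered every item.
full_mean = sum(map(sum, scores)) / (200 * 40)
# Estimate when every student answers only 5 of the 40 items.
estimate = matrix_sample_mean(scores, items_per_student=5, rng=rng)
print(f"full-data mean: {full_mean:.3f}  item-sample estimate: {estimate:.3f}")
```

The estimate describes the group, not any individual student, which
is why the approach fits cases where the unit of study is the class
or the program.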

With respect to item sampling, I feel that there are at least 
two important questions for which we do not have answers. The first
of them is the context effect. One way to use item sampling is to 
give each student one item and to give different items to different 
students, but I have as yet no idea of the physical effect this has 
on the student. If you give him one item out of context, will he 
respond differently than if the item were in the context of similar 
items, or, for that matter, in the context of different items? 

Ken Sirotnik and I are now performing a study to examine this issue.

My other question about item sampling concerns its optimal
use. Given a set of subjects under certain circumstances and items 
with certain characteristics and various test conditions, what is 
the optimal number of items to give to how many students? Cur- 
rently Dr. Sirotnik and I are also planning a computer simulation 
study to examine this messy issue. 
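
A simulation of this kind might be sketched as follows. The response
model (ability plus item easiness, clipped to a probability), the
pool of 60 items, the budget of 600 responses, and the name
`simulate_design` are all assumptions made for illustration; the
planned study itself is not described here.

```python
import random
import statistics

def simulate_design(easiness, n_students, items_per_student, n_reps, rng):
    """Return the standard deviation, across replications, of the
    estimated mean proportion-correct for one item-sampling design."""
    estimates = []
    for _ in range(n_reps):
        total = 0
        for _ in range(n_students):
            # Hypothetical student ability, drawn fresh for each student.
            ability = rng.uniform(0.3, 0.7)
            for i in rng.sample(range(len(easiness)), items_per_student):
                # Hypothetical response model, clipped to [0.05, 0.95].
                p = min(0.95, max(0.05, ability + easiness[i]))
                total += rng.random() < p
        estimates.append(total / (n_students * items_per_student))
    return statistics.stdev(estimates)

rng = random.Random(1)
easiness = [rng.uniform(-0.2, 0.2) for _ in range(60)]  # fixed item pool

# The same budget of 600 responses, split three different ways.
results = {}
for n_students, k in [(600, 1), (120, 5), (30, 20)]:
    results[(n_students, k)] = simulate_design(
        easiness, n_students, k, n_reps=200, rng=rng)
    print(f"{n_students:4d} students x {k:2d} items: "
          f"SD of estimate = {results[(n_students, k)]:.4f}")
```

A smaller standard deviation means a more stable estimate for the
same number of responses. Any ordering of the designs seen here
depends on the assumed response model and should not be read as a
general result.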

The item sampling research I have been pursuing has reminded 
me of another dimension which must be considered in evaluation 
studies. In one of two empirical tryouts of item sampling pro- 
cedures we obtained an item matrix sample that produced a negative 
variance for the population from which we were sampling. We finally 
decided that there was no mistake in the formula and discovered the 
negative variance would be produced by item matrix samples with 
negative coefficient outputs.

This led us to some serious thinking about the nature of the 
collection of items from which we had samples. We were led, for 
one thing, to see the need for a special kind of homogeneity in the 
population from which you are taking your items--not necessarily a
homogeneity in the coefficient alpha sense--but some other kind.

The main conclusion we reached was that we had to pay more attention 
to the purposes of the test than we had thought necessary, and this 
is another dimension of extreme importance in evaluation. 
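
For readers who want the coefficient alpha mentioned above in
concrete form, the sketch below computes it for a small
students-by-items score matrix and shows that it turns negative when
items tend to disagree with one another. The two data sets are
invented, and treating a negative alpha as the condition behind the
negative variance estimate described earlier is an assumption, not
something the text establishes.

```python
import statistics

def coefficient_alpha(scores):
    """Coefficient alpha for a students-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    Undefined (division by zero) if all total scores are identical.
    """
    k = len(scores[0])
    item_vars = [statistics.variance(col) for col in zip(*scores)]
    total_var = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data: two items that tend to disagree with each other.
opposed = [[1, 0], [0, 1], [1, 0], [0, 1], [1, 1], [0, 0]]
# Hypothetical data: two items that tend to agree with each other.
aligned = [[1, 1], [0, 0], [1, 1], [0, 0], [1, 0], [0, 1]]

alpha_opposed = coefficient_alpha(opposed)
alpha_aligned = coefficient_alpha(aligned)
print(f"opposed items: alpha = {alpha_opposed:.2f}")  # -1.00
print(f"aligned items: alpha = {alpha_aligned:.2f}")  # 0.50
```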

Not only the content of the test and the unit of study, but also
the nature of the test is important. Do we want an achievement test
with maximum variance? Do we want a test to measure change? Should
the test be course-vocabulary free?

The question here is one of defining criteria for the various 
purposes. I do not think that we need a new statistics for any of 
the points I have made up to now; I do not think we need a new 
test theory. I do think we have to be a lot clearer about what we 
are trying to do. 

As my last point I would like to bring up something I do not 
know how to handle at all. I will use Dr. Popham as an example,
largely because we have talked about his particular issue. He is 
trying to train product researchers--people who will be near-technicians
and who will hopefully produce better instructional programs.

It is one thing to say that this can be done by just using the rules 
of the game that we already know, but it would be silly not to try 
to learn how better products are built. Given that the need to 
produce a product includes the possibility of performing an intermi- 
nable series of experiments which examine each variable, how will we 
be able to collect and use data from the ongoing developmental pro- 
cess to help understand product development and, most of all, improve 



